347 research outputs found
Guaranteed robustness properties of multivariable, nonlinear, stochastic optimal regulators
The robustness of optimal regulators for nonlinear, deterministic and stochastic, multi-input dynamical systems is studied under the assumption that all state variables can be measured. It is shown that, under mild assumptions, such nonlinear regulators have a guaranteed infinite gain margin; moreover, they have a guaranteed 50 percent gain reduction margin and a 60 degree phase margin, in each feedback channel, provided that the system is linear in the control and the penalty to the control is quadratic, thus extending the well-known properties of LQ regulators to nonlinear optimal designs. These results are also valid for infinite horizon, average cost, stochastic optimal control problems.
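The LQ special case referred to in this abstract can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the scalar plant x' = a x + b u, the weights q and r, and the function names are hypothetical and do not come from the paper. It computes the optimal gain from the scalar Riccati equation and then scales the loop gain by a factor g to see where stability is lost.

```python
import math

def lqr_gain(a, b, q, r):
    """Optimal gain k (u = -k x) for x' = a x + b u with cost integral of q x^2 + r u^2."""
    # Stabilizing root of the scalar Riccati equation 2 a p - (b p)^2 / r + q = 0.
    p = r * (a + math.sqrt(a * a + q * b * b / r)) / (b * b)
    return b * p / r

def closed_loop_pole(a, b, k, g):
    """Pole of x' = (a - g b k) x when the nominal loop gain is scaled by g."""
    return a - g * b * k

# Hypothetical open-loop unstable plant and weights (assumptions for illustration).
a, b, q, r = 1.0, 1.0, 1.0, 1.0
k = lqr_gain(a, b, q, r)
for g in (0.4, 0.5, 1.0, 100.0):
    pole = closed_loop_pole(a, b, k, g)
    print(f"g = {g:6.1f}  pole = {pole:9.3f}  stable = {pole < 0}")
```

For this particular plant the loop stays stable for g = 0.5 and for very large g, and loses stability at g = 0.4, which is consistent with the 50 percent gain reduction margin and infinite gain margin stated above. The margins are guarantees, so other plants may tolerate even larger gain reductions.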
Differentially Private Distributed Optimization
In distributed optimization and iterative consensus literature, a standard
problem is for agents to minimize a function over a subset of Euclidean
space, where the cost function is expressed as a sum of the individual agents' cost functions. In this paper,
we study the private distributed optimization (PDOP) problem with the
additional requirement that the cost function of the individual agents should
remain differentially private. The adversary attempts to infer information
about the private cost functions from the messages that the agents exchange.
Achieving differential privacy requires that any change of an individual's cost
function only results in unsubstantial changes in the statistics of the
messages. We propose a class of iterative algorithms for solving PDOP, which
achieves differential privacy and convergence to the optimal value. Our
analysis reveals the dependence of the achieved accuracy and the privacy levels
on the parameters of the algorithm. We observe that to achieve
ε-differential privacy the accuracy of the algorithm has the order of
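The trade-off between message noise and accuracy described in this abstract can be illustrated with a toy iteration. The sketch below is not the paper's algorithm: the quadratic costs f_i(x) = (x - c_i)^2 / 2, the complete communication graph, the 1/(t+2) step size, and all names are assumptions made for illustration, and the Laplace noise scale is not calibrated to any formal privacy guarantee.

```python
import random

def laplace(rng, scale):
    """Zero-mean Laplace sample, drawn as the difference of two exponentials."""
    if scale == 0.0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_consensus_descent(targets, steps, noise_scale, seed=0):
    """Agents minimize sum_i (x - c_i)^2 / 2 while sharing only noisy states.

    targets     -- the private minimizers c_i, one per agent (hypothetical costs)
    noise_scale -- Laplace scale added to every broadcast message
    Returns the average of the agents' final states.
    """
    rng = random.Random(seed)
    x = [0.0] * len(targets)
    for t in range(steps):
        # Each agent broadcasts a perturbed copy of its state.
        messages = [xi + laplace(rng, noise_scale) for xi in x]
        # Complete graph: every agent averages all received messages.
        z = sum(messages) / len(messages)
        # Local gradient step on the agent's own private cost.
        eta = 1.0 / (t + 2)
        x = [z - eta * (z - c) for c in targets]
    return sum(x) / len(x)

targets = [1.0, 2.0, 3.0, 6.0]  # private data; the true optimum is their mean
for scale in (0.0, 0.5, 2.0):
    estimate = noisy_consensus_descent(targets, steps=3000, noise_scale=scale)
    print(f"noise scale {scale}: estimate {estimate:.3f}")
```

With zero noise the iterate approaches the mean of the targets. With larger noise the messages reveal less about each c_i, but the final estimate is less accurate.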
Qualitative properties of α-fair policies in bandwidth-sharing networks
We consider a flow-level model of a network operating under an α-fair
bandwidth sharing policy (with ) proposed by Roberts and
Massoulié [Telecommunication Systems 15 (2000) 185-201]. This is a
probabilistic model that captures the long-term aspects of bandwidth sharing
between users or flows in a communication network. We study the transient
properties as well as the steady-state distribution of the model. In
particular, for , we obtain bounds on the maximum number of flows
in the network over a given time horizon, by means of a maximal inequality
derived from the standard Lyapunov drift condition. As a corollary, we
establish the full state space collapse property for all . For the
steady-state distribution, we obtain explicit exponential tail bounds on the
number of flows, for any , by relying on a norm-like Lyapunov
function. As a corollary, we establish the validity of the diffusion
approximation developed by Kang et al. [Ann. Appl. Probab. 19 (2009)
1719-1780], in steady state, for the case where and under a local
traffic condition. Comment: Published at http://dx.doi.org/10.1214/12-AAP915 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Optimization of Multiclass Queueing Networks: Polyhedral and Nonlinear Characterizations of Achievable Performance
We consider open and closed multiclass queueing networks with Poisson arrivals (in open networks), exponentially distributed class dependent service times, and with class dependent deterministic or probabilistic routing. For open networks, the performance objective is to minimize, over all sequencing and routing policies, a weighted sum of the expected response times of different classes. Using a powerful technique involving quadratic or higher order potential functions, we propose variants of a method to derive polyhedral and nonlinear spaces which contain the entire set of achievable response times under stable and preemptive scheduling policies. By optimizing over these spaces, we obtain lower bounds on achievable performance. In particular, we obtain a sequence of progressively more complicated nonlinear approximations (relaxations) which are progressively closer to the exact achievable space. In the special case of single station networks (multiclass queues and Klimov's model) and homogenous multiclass networks, our characterization gives exactly the achievable region. Consequently, the proposed method can be viewed as the natural extension of conservation laws to multiclass queueing networks. For closed networks, the performance objective is to maximize throughput. We similarly find polyhedral and nonlinear spaces that include the performance space and by maximizing over these spaces we obtain an upper bound on the optimal throughput. We check the tightness of our bounds by simulating heuristic scheduling policies for simple open networks and we find that the first order approximation of our method is at least as good as simulation-based existing methods. In terms of computational complexity and in contrast to simulation-based existing methods, the calculation of our first order bounds consists of solving a linear programming problem with both the number of variables and constraints being polynomial (quadratic) in the number of classes in the network. 
The i-th order approximation involves solving a convex programming problem in dimension O(R^{i+1}), where R is the number of classes in the network, which can be solved efficiently using techniques from semi-definite programming.
Two semi-Lagrangian fast methods for Hamilton-Jacobi-Bellman equations
In this paper we apply the Fast Iterative Method (FIM) for solving general
Hamilton-Jacobi-Bellman (HJB) equations and we compare the results with an
accelerated version of the Fast Sweeping Method (FSM). We find that FIM can
indeed be used to solve HJB equations with no relevant modifications with respect
to the original algorithm proposed for the eikonal equation, and that it
outperforms FSM in many cases. Observing the evolution of the active list of
nodes for FIM, we recover another numerical validation of the arguments
recently discussed in [Cacace et al., SISC 36 (2014), A570-A587] about the
impossibility of creating local single-pass methods for HJB equations.
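For readers unfamiliar with sweeping-type solvers, the sketch below shows the classical fast sweeping idea on the eikonal equation |∇u| = 1 with a single point source. It is neither the Fast Iterative Method nor the accelerated FSM variant studied in the paper. The square grid, the unit speed, the first-order upwind update, and the function name are assumptions chosen to keep the example short.

```python
import math

def fast_sweep_eikonal(n, source, h=1.0, sweeps=4):
    """Approximate the distance to `source` on an n x n grid by Gauss-Seidel sweeps."""
    INF = float("inf")
    u = [[INF] * n for _ in range(n)]
    si, sj = source
    u[si][sj] = 0.0
    fwd, bwd = range(n), range(n - 1, -1, -1)
    # The four alternating sweep orderings of the 2D grid.
    orderings = [(fwd, fwd), (bwd, fwd), (fwd, bwd), (bwd, bwd)]
    for _ in range(sweeps):
        for rows, cols in orderings:
            for i in rows:
                for j in cols:
                    if (i, j) == (si, sj):
                        continue
                    # Smallest neighbour value along each axis (upwind values).
                    a = min(u[i - 1][j] if i > 0 else INF,
                            u[i + 1][j] if i < n - 1 else INF)
                    b = min(u[i][j - 1] if j > 0 else INF,
                            u[i][j + 1] if j < n - 1 else INF)
                    if a == INF and b == INF:
                        continue  # no information has reached this node yet
                    if abs(a - b) >= h:
                        candidate = min(a, b) + h  # one-sided update
                    else:
                        candidate = (a + b + math.sqrt(2 * h * h - (a - b) ** 2)) / 2
                    u[i][j] = min(u[i][j], candidate)
    return u

u = fast_sweep_eikonal(5, (0, 0))
print([round(v, 3) for v in u[0]])  # values along the grid edge through the source
```

Along the grid edge through the source the scheme reproduces the exact distance. Off-axis values overestimate the Euclidean distance slightly, which is expected for a first-order upwind discretization.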
A Dynamic Programming Approach to Adaptive Fractionation
We conduct a theoretical study of various solution methods for the adaptive
fractionation problem. The two messages of this paper are: (i) dynamic
programming (DP) is a useful framework for adaptive radiation therapy,
particularly adaptive fractionation, because it allows us to assess how close
to optimal different methods are, and (ii) heuristic methods proposed in this
paper are near-optimal, and therefore, can be used to evaluate the best
possible benefit of using an adaptive fraction size.
The essence of adaptive fractionation is to increase the fraction size when
the tumor and organ-at-risk (OAR) are far apart (a "favorable" anatomy) and to
decrease the fraction size when they are close together. Given that a fixed
prescribed dose must be delivered to the tumor over the course of the
treatment, such an approach results in a lower cumulative dose to the OAR when
compared to that resulting from standard fractionation. We first establish a
benchmark by using the DP algorithm to solve the problem exactly. In this case,
we characterize the structure of an optimal policy, which provides guidance for
our choice of heuristics. We develop two intuitive, numerically near-optimal
heuristic policies, which could be used for more complex, high-dimensional
problems. Furthermore, one of the heuristics requires only a statistic of the
motion probability distribution, making it a reasonable method for use in a
realistic setting. Numerically, we find that the amount of decrease in dose to
the OAR can vary significantly (5 - 85%) depending on the amount of motion in
the anatomy, the number of fractions, and the range of fraction sizes allowed.
In general, the decrease in dose to the OAR is more pronounced when: (i) we
have a high probability of large tumor-OAR distances, (ii) we use many
fractions (as in a hyper-fractionated setting), and (iii) we allow large daily
fraction size deviations. Comment: 17 pages, 4 figures, 1 table
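The dynamic programming idea in this abstract can be sketched on a deliberately simplified model. Everything in the example is an assumption for illustration and not the paper's formulation: doses are integers, the OAR dose in a fraction is the tumor dose times a daily "sparing factor" observed before treatment, the factors are independent draws from a small discrete distribution, and the function and parameter names are hypothetical.

```python
from functools import lru_cache

def solve_adaptive_fractionation(n_fractions, total_dose, d_min, d_max, sparing):
    """Backward dynamic programming for a toy adaptive fractionation model.

    sparing -- list of (probability, factor) pairs for the daily sparing factor
    Returns (expected total OAR dose, policy), where policy(k, remaining, s)
    gives (cost-to-go, fraction size) once today's factor s has been observed.
    """
    INF = float("inf")

    @lru_cache(maxsize=None)
    def cost_to_go(k, remaining):
        # Expected OAR dose from fraction k onward, before today's factor is seen.
        if k == n_fractions:
            return 0.0 if remaining == 0 else INF  # prescription must be met exactly
        return sum(p * policy(k, remaining, s)[0] for p, s in sparing)

    def policy(k, remaining, s):
        # Try every allowed fraction size given today's observed factor s.
        options = [(s * d + cost_to_go(k + 1, remaining - d), d)
                   for d in range(d_min, min(d_max, remaining) + 1)]
        return min(options) if options else (INF, None)

    return cost_to_go(0, total_dose), policy

# Hypothetical instance: 2 fractions, 4 dose units, 1 to 3 units per fraction,
# favorable anatomy (factor 0.2) or unfavorable anatomy (factor 1.0), equally likely.
sparing = [(0.5, 0.2), (0.5, 1.0)]
expected_oar, policy = solve_adaptive_fractionation(2, 4, 1, 3, sparing)
print("expected OAR dose:", expected_oar)
print("fraction size on a favorable first day:", policy(0, 4, 0.2)[1])
print("fraction size on an unfavorable first day:", policy(0, 4, 1.0)[1])
```

In this toy instance the policy delivers the largest allowed fraction on a favorable day and the smallest on an unfavorable day. The expected OAR dose works out to 2.0 units, compared with 2.4 units for two equal fractions of 2 units.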
Evolutionary game of coalition building under external pressure
We study the fragmentation-coagulation (or merging and splitting)
evolutionary control model as introduced recently by one of the authors, where
small players can form coalitions to resist the pressure exerted by the
principal. It is a Markov chain in continuous time and the players have a
common reward to optimize. We study the behavior as grows and show that the
problem converges to a (one player) deterministic optimization problem in
continuous time, in the infinite dimensional state space.
Pseudorehearsal in value function approximation
Catastrophic forgetting is of special importance in reinforcement learning,
as the data distribution is generally non-stationary over time. We study and
compare several pseudorehearsal approaches for Q-learning with function
approximation in a pole balancing task. We have found that pseudorehearsal
seems to assist learning even in such very simple problems, given proper
initialization of the rehearsal parameters.
- …